65 research outputs found

    Deep knowledge transfer for generalization across tasks and domains under data scarcity

    Get PDF
    Over the last decade, deep learning approaches have achieved tremendous performance in a wide variety of fields, e.g., computer vision and natural language understanding, and across several sectors such as healthcare, industrial manufacturing, and driverless mobility. Most deep learning successes were accomplished in learning scenarios fulfilling the following two requirements. First, large amounts of data are available for training the deep learning model, with no access restrictions to the data. Second, the data used for training and testing are independent and identically distributed (i.i.d.). However, many real-world applications violate at least one of these requirements, which results in challenging learning problems. The present thesis comprises four contributions addressing four such learning problems. In each contribution, we propose a novel method and empirically demonstrate its effectiveness for the corresponding problem setting. The first part addresses the underexplored intersection of the few-shot learning and one-class classification problems. In this learning scenario, the model has to learn a new task using only a few examples from only the majority class, without overfitting to the few examples or to the majority class. This scenario is encountered in real-world anomaly detection applications where data is scarce. We propose an episode sampling technique that adapts meta-learning algorithms designed for class-balanced few-shot classification to the addressed few-shot one-class classification problem, by optimizing for a model initialization tailored to this scenario. In addition, we provide theoretical and empirical analyses of the need for second-order derivatives to learn such parameter initializations. Our experiments on 8 image and time-series datasets, including a real-world dataset of industrial sensor readings, demonstrate the effectiveness of our method.
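    The episode-sampling idea can be sketched in a few lines: support sets contain only normal examples, mirroring the one-class setting at test time, while query sets stay class-balanced so meta-training still receives a signal about both classes. A minimal illustration (function and variable names are hypothetical, not from the thesis):

```python
import random

def sample_oc_episode(normal, anomalous, k_support=5, k_query=10):
    """Sample one few-shot one-class episode (illustrative sketch).

    The support set contains only normal examples, mimicking the
    one-class setting at test time; the query set stays class-balanced
    so the meta-learner is still evaluated on both classes.
    """
    support = random.sample(normal, k_support)
    query = (random.sample(normal, k_query // 2)
             + random.sample(anomalous, k_query // 2))
    support_labels = [0] * k_support
    query_labels = [0] * (k_query // 2) + [1] * (k_query // 2)
    return (support, support_labels), (query, query_labels)
```

    Such episodes can then be fed to any initialization-based meta-learner in place of its usual class-balanced episodes.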
The second part tackles the intersection of the continual learning and anomaly detection problems, which, to the best of our knowledge, we are the first to explore. In this learning scenario, the model is exposed to a stream of anomaly detection tasks, i.e., tasks for which only examples from the normal class are available, which it has to learn sequentially. Such problem settings are encountered in anomaly detection applications where the data distribution continuously changes. We propose a meta-learning approach that learns parameter-specific initializations and learning rates suitable for continual anomaly detection. Our empirical evaluations show that a model trained with our algorithm is able to learn up to 100 anomaly detection tasks sequentially with minimal catastrophic forgetting and overfitting to the majority class. In the third part, we address the domain generalization problem, in which a model trained on several source domains is expected to generalize well to data from a previously unseen target domain, without any modification or exposure to its data. This challenging learning scenario arises in applications involving domain shift, e.g., different clinical centers using different MRI scanners or data acquisition protocols. We assume that learning to extract a richer set of features improves the transfer to a wider set of unknown domains. Motivated by this, we propose an algorithm that identifies the already learned features and corrupts them, hence enforcing new feature discovery. We leverage methods from the explainable machine learning literature to identify the features, and apply the targeted corruption at multiple representation levels, including the input data and high-level embeddings. Our extensive empirical evaluation shows that our approach outperforms 18 domain generalization algorithms on multiple benchmark datasets.
The last part of the thesis addresses the intersection of domain generalization and data-free learning methods, which, to the best of our knowledge, we are the first to explore. Here, we address the learning scenario where a model robust to domain shift is needed, and only models trained on the same task but on different domains are available instead of the original datasets. This scenario is relevant for any domain generalization application where access to the data of the source domains is restricted, e.g., due to data privacy concerns or intellectual property infringement. We develop an approach that extracts and fuses domain-specific knowledge from the available teacher models into a student model robust to domain shift, by generating synthetic cross-domain data. Our empirical evaluation demonstrates the effectiveness of our method, which outperforms ensemble and data-free knowledge distillation baselines. Most importantly, the proposed approach substantially reduces the gap between the best data-free baseline and the upper-bound baseline that uses the original private data.
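    The knowledge-fusion step of such data-free distillation can be illustrated by averaging the domain-specific teachers' outputs on generated data to form soft targets for the student. This is only a sketch under simplifying assumptions (uniform teacher weighting, a synthetic batch given as input); the actual method also learns the cross-domain data generator, which is not shown:

```python
def fuse_teacher_logits(teachers, synthetic_batch):
    """Average the domain-specific teachers' logits on generated data
    to form soft targets for the domain-robust student (a sketch;
    `teachers` are callables mapping one example to a logit list)."""
    targets = []
    for x in synthetic_batch:
        logits = [t(x) for t in teachers]
        n_classes = len(logits[0])
        targets.append([sum(l[c] for l in logits) / len(teachers)
                        for c in range(n_classes)])
    return targets
```

    The student would then be trained to match these fused targets on the synthetic batch, never touching the private source data.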

    ARCADe: A Rapid Continual Anomaly Detector

    Full text link
    Although continual learning and anomaly detection have separately been well studied in previous works, their intersection remains rather unexplored. The present work addresses a learning scenario where a model has to incrementally learn a sequence of anomaly detection tasks, i.e., tasks for which only examples from the normal (majority) class are available for training. We define this novel learning problem of continual anomaly detection (CAD) and formulate it as a meta-learning problem. Moreover, we propose A Rapid Continual Anomaly Detector (ARCADe), an approach to train neural networks to be robust against the major challenges of this new learning problem, namely catastrophic forgetting and overfitting to the majority class. The results of our experiments on three datasets show that, in the CAD problem setting, ARCADe substantially outperforms baselines from the continual learning and anomaly detection literature. Finally, we provide deeper insights into the learning strategy yielded by the proposed meta-learning algorithm.
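    The meta-learned per-parameter initializations and learning rates reduce, in the inner loop, to an update of the following shape (an illustrative sketch; both `params` and `lrs` are meta-optimized in an outer loop, which is not shown):

```python
def inner_update(params, grads, lrs):
    """One inner-loop adaptation step in which each parameter has its
    own meta-learned learning rate: p <- p - lr * g, elementwise.
    A per-parameter lr near zero effectively freezes that parameter,
    which is one way such training can limit forgetting."""
    return [p - lr * g for p, g, lr in zip(params, grads, lrs)]
```

    With a shared scalar learning rate this collapses to plain gradient descent; the per-parameter variant gives the meta-learner a knob to protect some weights while adapting others.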

    Discovery of New Multi-Level Features for Domain Generalization via Knowledge Corruption

    Full text link
    Machine learning models that can generalize to unseen domains are essential when applied in real-world scenarios involving strong domain shifts. We address the challenging domain generalization (DG) problem, where a model trained on a set of source domains is expected to generalize well in unseen domains without any exposure to their data. The main challenge of DG is that the features learned from the source domains are not necessarily present in the unseen target domains, leading to performance deterioration. We assume that learning a richer set of features is crucial to improve the transfer to a wider set of unknown domains. For this reason, we propose COLUMBUS, a method that enforces new feature discovery via a targeted corruption of the most relevant input and multi-level representations of the data. We conduct an extensive empirical evaluation to demonstrate the effectiveness of the proposed approach, which achieves new state-of-the-art results by outperforming 18 DG algorithms on multiple DG benchmark datasets in the DomainBed framework.
    Comment: Accepted at AAAI 2022 (AIBSD Workshop) and ICPR 202
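    The targeted-corruption idea can be sketched as masking the currently most relevant features so that training pressure shifts to features not yet discovered. A toy version on a single feature vector (COLUMBUS itself scores relevance with attribution methods and applies the corruption at several representation levels, not just one):

```python
def corrupt_top_features(x, relevance, k=2, fill=0.0):
    """Corrupt the k most relevant features of one example, forcing
    the model to rely on other, so far unused, predictive features.
    `relevance` is assumed to come from an attribution method."""
    top = sorted(range(len(x)), key=lambda i: relevance[i], reverse=True)[:k]
    return [fill if i in top else v for i, v in enumerate(x)]
```

    Applied during training, this acts like a relevance-guided dropout rather than a random one.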

    Performance Evaluation of Pre-computation Algorithms for Inter-Domain QoS Routing

    Get PDF
    Inter-domain QoS routing is a very challenging problem area. It combines the complexity of QoS routing with the limitations of inter-domain routing, such as domain heterogeneity and information confidentiality. Pre-computation offers a very promising solution for addressing this problem. Although the pre-computation scheme has been investigated in several previous studies for a single routing domain, applying pre-computation at the inter-domain level is not straightforward and necessitates deeper investigation. In this work, we study different algorithms for QoS routing based on pre-computation. First, we investigate an exact algorithm, which provides an optimal solution to the QoS routing problem; however, its application in large-scale networks is not always practical. Second, heuristic solutions are also investigated, in particular through a detailed study of the ID-MEFPA and ID-PPPA heuristics. Analytical studies and extensive simulations confirm that the exact algorithm achieves the best success rate, but has a very high computational complexity. The ID-MEFPA heuristic has a lower complexity and provides a success rate always close to that of the exact algorithm. When inter-domain connectivity is high, the ID-PPPA heuristic is the most appropriate, with the lowest computational complexity and a success rate very close to that of the exact algorithm.
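    The pre-computation principle itself is simple: pay the path-computation cost once, offline, so that each routing query reduces to a table lookup. A toy single-metric analogue using Floyd-Warshall on link delays (the surveyed inter-domain algorithms additionally handle multiple QoS constraints and domain confidentiality, which this sketch ignores):

```python
def precompute_delays(n, edges):
    """Floyd-Warshall pre-computation of least-delay values between all
    node pairs of an undirected graph, so a QoS admission query
    ("is a path under D ms available?") becomes a table lookup.
    `edges` is a list of (u, v, delay) triples."""
    INF = float("inf")
    d = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for u, v, w in edges:
        d[u][v] = min(d[u][v], w)
        d[v][u] = min(d[v][u], w)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d
```

    The O(n^3) cost is paid offline; answering a query is then O(1), which is the trade-off the evaluated algorithms exploit.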

    Building a binary outranking relation in uncertain, imprecise and multi-experts contexts: The application of evidence theory

    Get PDF
    We consider multicriteria decision problems where the actions are evaluated on a set of ordinal criteria. The evaluation of each alternative with respect to each criterion may be uncertain and/or imprecise and is provided by one or several experts. We model this evaluation as a basic belief assignment (BBA). In order to compare the different pairs of alternatives according to each criterion, the concept of first belief dominance is proposed. Additionally, the criteria weights are also expressed by means of a BBA. A model inspired by ELECTRE I is developed and illustrated by a pedagogical example.
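    For reference, the belief function underlying comparisons such as first belief dominance assigns to an event the total mass of all focal elements it contains: Bel(A) = sum of m(B) over all B contained in A. A minimal sketch:

```python
def belief(bba, event):
    """Belief of `event` under a basic belief assignment `bba`, given
    as a dict mapping frozensets of outcomes (focal elements) to
    masses: the total mass of focal elements contained in `event`."""
    return sum(m for focal, m in bba.items() if focal <= frozenset(event))
```

    For example, with masses 0.5 on {a}, 0.3 on {a, b} and 0.2 on {c}, the belief of {a, b} is 0.8, since both {a} and {a, b} are subsets of it.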

    FRAug: Tackling Federated Learning with Non-IID Features via Representation Augmentation

    Full text link
    Federated Learning (FL) is a decentralized learning paradigm in which multiple clients collaboratively train deep learning models without centralizing their local data, and hence preserve data privacy. Real-world applications usually involve a distribution shift across the datasets of the different clients, which hurts the generalization ability of the clients to unseen samples from their respective data distributions. In this work, we address the recently proposed feature shift problem, where the clients have different feature distributions while the label distribution is the same. We propose Federated Representation Augmentation (FRAug) to tackle this practical and challenging problem. Our approach generates synthetic client-specific samples in the embedding space to augment the usually small client datasets. For that, we train a shared generative model to fuse the clients' knowledge learned from their different feature distributions. This generator synthesizes client-agnostic embeddings, which are then locally transformed into client-specific embeddings by Representation Transformation Networks (RTNets). By transferring knowledge across the clients, the generated embeddings act as a regularizer for the client models and reduce overfitting to the local original datasets, hence improving generalization. Our empirical evaluation on public benchmarks and a real-world medical dataset demonstrates the effectiveness of the proposed method, which substantially outperforms the current state-of-the-art FL methods for non-IID features, including PartialFed and FedBN.
    Comment: ICCV 202
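    The role of the RTNets can be caricatured as a per-client map applied to a shared, client-agnostic embedding. The sketch below uses a simple affine transform; the real RTNets are small learned networks, and `scale` and `shift` are hypothetical stand-ins for their parameters:

```python
def rtnet_transform(shared_embedding, scale, shift):
    """Toy stand-in for a Representation Transformation Network:
    map a client-agnostic embedding into a client-specific one with
    per-client, per-dimension parameters (elementwise s * e + b)."""
    return [s * e + b for e, s, b in zip(shared_embedding, scale, shift)]
```

    Each client keeps its own `scale`/`shift` locally, so the same generated embedding is bent toward every client's feature distribution without sharing raw data.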

    Evaluation of an in silico predicted specific and immunogenic antigen from the OmcB protein for the serodiagnosis of Chlamydia trachomatis infections

    Get PDF
    Background: The OmcB protein is one of the most immunogenic proteins in C. trachomatis and C. pneumoniae infections. This protein is highly conserved, leading to serum cross-reactivity between the various chlamydial species. Since previous studies based on recombinant proteins failed to identify a species-specific immune response against the OmcB protein, this study evaluated an in silico predicted specific and immunogenic antigen from the OmcB protein for the serodiagnosis of C. trachomatis infections.
    Results: Using the ClustalW and Antigenic programs, we selected two predicted specific and immunogenic regions in the OmcB protein: the N-terminal (Nt) region containing three epitopes and the C-terminal (Ct) region containing two epitopes with high scores. These regions were cloned into the PinPoint Xa-1 and pGEX-6P-1 expression vectors, incorporating a biotin purification tag and a glutathione-S-transferase tag, respectively, and were then expressed in E. coli. Only pGEX-6P-1 was found suitable for serological studies, as its tag showed less cross-reactivity with human sera, and it was retained for the evaluation of the selected antigens. Only the Ct region of the protein was well expressed in E. coli and was evaluated for its ability to be recognized by human sera. 384 sera were tested for the presence of IgG antibodies to C. trachomatis by our in-house microimmunofluorescence (MIF) assay and the developed ELISA test. Using MIF as the reference method, the developed OmcB Ct ELISA has a high specificity (94.3%) but a low sensitivity (23.9%). Our results indicate that the use of the sequence alignment tool might be useful for identifying specific regions in an immunodominant antigen. However, the two epitopes located in the selected Ct region, out of the 24 predicted in the full-length OmcB protein, account for approximately 25% of the serological response detected by MIF, which limits the use of the developed ELISA test when screening for C. trachomatis infections.
    Conclusion: The developed ELISA test might be used as a confirmatory test to assess the specificity of serological results found by MIF.
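    The reported specificity (94.3%) and sensitivity (23.9%) follow the standard definitions computed against the MIF reference: sensitivity = TP/(TP+FN) and specificity = TN/(TN+FP). A minimal sketch:

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Diagnostic test metrics against a reference standard:
    sensitivity = true positives / all reference-positive sera,
    specificity = true negatives / all reference-negative sera."""
    return tp / (tp + fn), tn / (tn + fp)
```

    A confirmatory test, as proposed in the conclusion, is exactly the use case for a high-specificity, low-sensitivity assay.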

    CL-CrossVQA: A Continual Learning Benchmark for Cross-Domain Visual Question Answering

    Full text link
    Visual Question Answering (VQA) is a multi-discipline research task. To produce the right answer, it requires an understanding of the visual content of images, the natural language questions, as well as commonsense reasoning over the information contained in the image and world knowledge. Recently, large-scale Vision-and-Language Pre-trained Models (VLPMs) have become the mainstream approach to VQA tasks due to their superior performance. The standard practice is to fine-tune large-scale VLPMs pre-trained on huge general-domain datasets using domain-specific VQA datasets. However, in reality, the application domain can change over time, necessitating VLPMs to continually learn and adapt to new domains without forgetting previously acquired knowledge. Most existing continual learning (CL) research concentrates on unimodal tasks, whereas a more practical application scenario, i.e., CL on cross-domain VQA, has not been studied. Motivated by this, we introduce CL-CrossVQA, a rigorous Continual Learning benchmark for Cross-domain Visual Question Answering, through which we conduct extensive experiments on 4 VLPMs, 4 CL approaches, and 5 VQA datasets from different domains. In addition, by probing the forgetting phenomenon of the intermediate layers, we provide insights into how model architecture affects CL performance, why CL approaches can help mitigate forgetting in VLPMs to some extent, and how to design CL approaches suitable for VLPMs in this challenging continual learning environment. To facilitate future work on CL for cross-domain VQA, we will release our datasets and code.
    Comment: 10 pages, 6 figures

    Training and fitness variability in elite youth soccer:perspectives from a difficulty prediction model

    Get PDF
    Research within sport science disciplines seeks to enhance performance via the combination of factors that influence a team's periodization. The current study aimed to investigate the variations in training load (TL), and the consequential changes in fitness variables, based on the use of a match difficulty prediction model (MDP), the level of opposition (LOP), days between matches, and match location during 12 weeks of competitive period I. Seventeen elite soccer players (age = 17.57 ± 0.49 years; body height 1.79 ± 0.05 m; body weight 72.21 ± 6.96 kg) completed a Yo-Yo intermittent recovery test, a running-based anaerobic sprint test, a soccer-specific repeated sprint ability test, and a vertical jump test to identify changes in players' fitness. TL was determined by multiplying the RPE of the session by its duration in minutes (s-RPE). Training monotony, strain, and the acute:chronic workload ratio (ACWR) were also assessed. A simple regression model was conducted and the highest variances explained (R2) were used. The LOP score explained most of the variance in ACWR (r = 0.606, R2 = 0.37). TL declined significantly when comparing match-day to the first three days and the last three days of the week. No significant difference was found in s-RPE between the high and low MDP factors. Strong negative correlations were reported between ACWR and LOP (r = -0.714, p < .01). In addition, we found significant improvements between pre- and post-test in repeated sprint ability, namely in fatigue index (d = 1.104) and in best, ideal, total, and mean-best times (d = 0.518-0.550), as well as in aerobic and anaerobic fitness variables (p < .05). The MDP could facilitate training prescription as well as the distribution of training intensities with high specificity, supporting long-term youth player development and allowing teams to maintain optimal fitness leading into more difficult matches.
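    The load metrics used in the study follow standard definitions: s-RPE is session RPE multiplied by duration in minutes, monotony is the weekly mean daily load divided by its standard deviation, and ACWR divides the acute (most recent week) load by the chronic (typically a 4-week rolling average) load. A minimal sketch of these computations:

```python
def srpe(rpe, minutes):
    """Session training load (s-RPE, in arbitrary units):
    session RPE multiplied by session duration in minutes."""
    return rpe * minutes

def monotony(daily_loads):
    """Training monotony over one week: mean daily load divided by
    the (sample) standard deviation of the daily loads."""
    n = len(daily_loads)
    mean = sum(daily_loads) / n
    sd = (sum((x - mean) ** 2 for x in daily_loads) / (n - 1)) ** 0.5
    return mean / sd

def acwr(acute_load, chronic_load):
    """Acute:chronic workload ratio: last week's load divided by the
    rolling-average chronic load (often the preceding 4 weeks)."""
    return acute_load / chronic_load
```

    For instance, a session rated 6 on the RPE scale lasting 60 minutes contributes a load of 360 AU.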